ordinary differential equation
Nyström-Accelerated Primal LS-SVMs: Breaking the O(an3) Complexity Bottleneck for Scalable ODEs Learning
A major problem of kernel-based methods (e.g., least squares support vector machines, LS-SVMs) for solving linear/nonlinear ordinary differential equations (ODEs) is the prohibitive O(an3) (a = 1 for linear ODEs and 27 for nonlinear ODEs) part of their computational complexity with increasing temporal discretization points n. We propose a novel Nyström-accelerated LS-SVMs framework that breaks this bottleneck by reformulating ODEs as primal-space constraints. Specifically, we derive for the first time an explicit Nyström-based mapping and its derivatives from one-dimensional temporal discretization points to a higher m-dimensional feature space (1 < m n), enabling the learning process to solve linear/nonlinear equation systems with m-dependent complexity. Numerical experiments on sixteen benchmark ODEs demonstrate: 1) 10 6000 times faster computation than classical LS-SVMs and physics-informed neural networks (PINNs), 2) comparable accuracy to LS-SVMs (< 0.13% relative MAE, RMSE, and y ˆy difference) while maximum surpassing PINNs by 72% in RMSE, and 3) scalability to n = 104 time steps with m = 50features. This work establishes a new paradigm for efficient kernel-based ODEs learning without significantly sacrificing the accuracy of the solution.
Derivations of Formulas
We have omitted a number of complicated formulas in the main text to provide clear intuition and concise proof sketch. We will list all mentioned formulas here for readers' reference. We consider the case where U = V = Aand Σ is symmetric and full-rank, and we use gradient flow. We can derive the dynamics of S = AA>as S:= (Σ S)S+ S(Σ S), which is a quadratic ordinary differential equation and it is hard to solve directly. For simplicity, define X:= X Σ 1. Then X = XΣ ΣX. (24) Solving this equation and we have And it is interesting to verify that S(t) + P(t) Σ by using the following lemma.
Improving Infinitely Deep Bayesian Neural Networks with Nesterov's Accelerated Gradient Method
As a representative continuous-depth neural network approach, stochastic differential equation (SDE)-based Bayesian neural networks (BNNs) have attracted considerable attention due to their solid theoretical foundations and strong potential for real-world applications. However, their reliance on numerical SDE solvers inevitably incurs a large number of function evaluations (NFEs), resulting in high computational cost and occasional convergence instability. To address these challenges, we propose a Nesterov-accelerated gradient (NAG) enhanced SDE-BNN model. By integrating NAG into the SDE-BNN framework along with an NFE-dependent residual skip connection, our method accelerates convergence and substantially reduces NFEs during both training and testing. Extensive empirical results show that our model consistently outperforms conventional SDE-BNNs across various tasks, including image classification and sequence modeling, achieving lower NFEs and improved predictive accuracy.